Philosophy is essential because it helps us think critically. This project asks whether "man" is one of the most important subjects in the history of philosophy. In philosophy, "man" refers both to the human being and to human nature. According to "The Conception of Man in Existential Philosophy", man is a unity and a totality.
The dataset that I analyze is hosted on Kaggle. It contains about 360,000 sentences from 13 schools of philosophy. The main techniques that I use to explore the data are word clouds (WordCloud) and Term Frequency-Inverse Document Frequency (TF-IDF).
import numpy as np
import pandas as pd
import nltk
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer, TfidfTransformer
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
import matplotlib.pyplot as plt
df=pd.read_csv("../data/philosophy_data.csv")
df.head()
| | title | author | school | sentence_spacy | sentence_str | original_publication_date | corpus_edition_date | sentence_length | sentence_lowered | tokenized_txt | lemmatized_str |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Plato - Complete Works | Plato | plato | What's new, Socrates, to make you leave your ... | What's new, Socrates, to make you leave your ... | -350 | 1997 | 125 | what's new, socrates, to make you leave your ... | ['what', 'new', 'socrates', 'to', 'make', 'you... | what be new , Socrates , to make -PRON- lea... |
| 1 | Plato - Complete Works | Plato | plato | Surely you are not prosecuting anyone before t... | Surely you are not prosecuting anyone before t... | -350 | 1997 | 69 | surely you are not prosecuting anyone before t... | ['surely', 'you', 'are', 'not', 'prosecuting',... | surely -PRON- be not prosecute anyone before ... |
| 2 | Plato - Complete Works | Plato | plato | The Athenians do not call this a prosecution b... | The Athenians do not call this a prosecution b... | -350 | 1997 | 74 | the athenians do not call this a prosecution b... | ['the', 'athenians', 'do', 'not', 'call', 'thi... | the Athenians do not call this a prosecution ... |
| 3 | Plato - Complete Works | Plato | plato | What is this you say? | What is this you say? | -350 | 1997 | 21 | what is this you say? | ['what', 'is', 'this', 'you', 'say'] | what be this -PRON- say ? |
| 4 | Plato - Complete Works | Plato | plato | Someone must have indicted you, for you are no... | Someone must have indicted you, for you are no... | -350 | 1997 | 101 | someone must have indicted you, for you are no... | ['someone', 'must', 'have', 'indicted', 'you',... | someone must have indict -PRON- , for -PRON- ... |
print('The shape of dataset: ', df.shape)
print('Missing Values in dataset: ', df.isnull().sum().sum())
The shape of dataset: (360808, 11)
Missing Values in dataset: 0
The overview shows that the dataset has 360,808 rows and 11 columns and no missing values. The first question that I asked myself was: how do I know whether gender is important in philosophy? I filter out the columns that I do not need for this analysis and focus on title, author, school, original_publication_date, corpus_edition_date, and sentence_lowered, because I would like to use word clouds to visualize the most frequent words in each school.
# Keep only the needed columns (drops sentence_spacy, sentence_str, sentence_length, tokenized_txt, lemmatized_str)
final_df = df[['title','author','school', 'original_publication_date', 'corpus_edition_date', 'sentence_lowered']]
final_df.head()
| | title | author | school | original_publication_date | corpus_edition_date | sentence_lowered |
|---|---|---|---|---|---|---|
| 0 | Plato - Complete Works | Plato | plato | -350 | 1997 | what's new, socrates, to make you leave your ... |
| 1 | Plato - Complete Works | Plato | plato | -350 | 1997 | surely you are not prosecuting anyone before t... |
| 2 | Plato - Complete Works | Plato | plato | -350 | 1997 | the athenians do not call this a prosecution b... |
| 3 | Plato - Complete Works | Plato | plato | -350 | 1997 | what is this you say? |
| 4 | Plato - Complete Works | Plato | plato | -350 | 1997 | someone must have indicted you, for you are no... |
pd.DataFrame(final_df.groupby(by = ['title', 'author', 'school','original_publication_date', 'corpus_edition_date'])['title'].count())
| title | author | school | original_publication_date | corpus_edition_date | title (sentence count) |
|---|---|---|---|---|---|
| A General Theory Of Employment, Interest, And Money | Keynes | capitalism | 1936 | 2003 | 3411 |
| A Treatise Concerning The Principles Of Human Knowledge | Berkeley | empiricism | 1710 | 2009 | 1040 |
| A Treatise Of Human Nature | Hume | empiricism | 1739 | 2003 | 7047 |
| Anti-Oedipus | Deleuze | continental | 1972 | 1997 | 6679 |
| Aristotle - Complete Works | Aristotle | aristotle | -320 | 1991 | 48779 |
| Being And Time | Heidegger | phenomenology | 1927 | 1996 | 8505 |
| Beyond Good And Evil | Nietzsche | nietzsche | 1886 | 2003 | 1906 |
| Capital | Marx | communism | 1883 | 1887 | 12996 |
| Critique Of Judgement | Kant | german_idealism | 1790 | 2007 | 4204 |
| Critique Of Practical Reason | Kant | german_idealism | 1788 | 2002 | 2452 |
| Critique Of Pure Reason | Kant | german_idealism | 1781 | 1998 | 7472 |
| Dialogues Concerning Natural Religion | Hume | empiricism | 1779 | 2009 | 1265 |
| Difference And Repetition | Deleuze | continental | 1968 | 1994 | 5861 |
| Discourse On Method | Descartes | rationalism | 1637 | 2008 | 340 |
| Ecce Homo | Nietzsche | nietzsche | 1888 | 2016 | 1504 |
| Elements Of The Philosophy Of Right | Hegel | german_idealism | 1820 | 1991 | 4923 |
| Enchiridion | Epictetus | stoicism | 125 | 2014 | 323 |
| Essay Concerning Human Understanding | Locke | empiricism | 1689 | 2004 | 7742 |
| Essential Works Of Lenin | Lenin | communism | 1862 | 1966 | 4469 |
| Ethics | Spinoza | rationalism | 1677 | 2003 | 3304 |
| History Of Madness | Foucault | continental | 1961 | 2006 | 8033 |
| Lewis - Papers | Lewis | analytic | 1985 | 2008 | 13120 |
| Meditations | Marcus Aurelius | stoicism | 170 | 2008 | 2212 |
| Meditations On First Philosophy | Descartes | rationalism | 1641 | 2008 | 792 |
| Naming And Necessity | Kripke | analytic | 1972 | 1990 | 2681 |
| Off The Beaten Track | Heidegger | phenomenology | 1950 | 2001 | 6734 |
| On Certainty | Wittgenstein | analytic | 1950 | 1969 | 1984 |
| On The Improvement Of Understanding | Spinoza | rationalism | 1677 | 1997 | 489 |
| On The Principles Of Political Economy And Taxation | Ricardo | capitalism | 1817 | 2010 | 3090 |
| Philosophical Investigations | Wittgenstein | analytic | 1953 | 1986 | 5838 |
| Philosophical Studies | Moore | analytic | 1910 | 2015 | 3668 |
| Philosophical Troubles | Kripke | analytic | 1975 | 2011 | 9798 |
| Plato - Complete Works | Plato | plato | -350 | 1997 | 38366 |
| Quintessence | Quine | analytic | 1950 | 2004 | 7373 |
| Science Of Logic | Hegel | german_idealism | 1817 | 2010 | 10678 |
| Second Treatise On Government | Locke | empiricism | 1689 | 2010 | 1143 |
| The Analysis Of Mind | Russell | analytic | 1921 | 2008 | 3513 |
| The Antichrist | Nietzsche | nietzsche | 1888 | 2006 | 1170 |
| The Birth Of The Clinic | Foucault | continental | 1963 | 2003 | 2518 |
| The Communist Manifesto | Marx | communism | 1848 | 1970 | 493 |
| The Crisis Of The European Sciences And Phenomenology | Husserl | phenomenology | 1936 | 1970 | 4832 |
| The Idea Of Phenomenology | Husserl | phenomenology | 1907 | 1999 | 910 |
| The Logic Of Scientific Discovery | Popper | analytic | 1959 | 2002 | 4678 |
| The Order Of Things | Foucault | continental | 1966 | 2002 | 4689 |
| The Phenomenology Of Perception | Merleau-Ponty | phenomenology | 1945 | 2002 | 7592 |
| The Phenomenology Of Spirit | Hegel | german_idealism | 1807 | 1977 | 7099 |
| The Problems Of Philosophy | Russell | analytic | 1912 | 2004 | 1560 |
| The Search After Truth | Malebranche | rationalism | 1674 | 1997 | 12997 |
| The Second Sex | Beauvoir | feminism | 1949 | 2009 | 13017 |
| The System Of Ethics | Fichte | german_idealism | 1798 | 2005 | 5308 |
| The Wealth Of Nations | Smith | capitalism | 1776 | 2009 | 11693 |
| Theodicy | Leibniz | rationalism | 1710 | 2005 | 5027 |
| Three Dialogues | Berkeley | empiricism | 1713 | 2009 | 1694 |
| Thus Spake Zarathustra | Nietzsche | nietzsche | 1887 | 2008 | 5916 |
| Tractatus Logico-Philosophicus | Wittgenstein | analytic | 1921 | 2001 | 1212 |
| Twilight Of The Idols | Nietzsche | nietzsche | 1888 | 2016 | 3052 |
| Vindication Of The Rights Of Woman | Wollstonecraft | feminism | 1792 | 2001 | 2559 |
| Women, Race, And Class | Davis | feminism | 1981 | 1981 | 3059 |
| Writing And Difference | Derrida | continental | 1967 | 2001 | 5999 |
# The range of original publication and corpus edition dates
print('The range of original publication date: ')
print([final_df['original_publication_date'].min(), final_df['original_publication_date'].max()] )
print('The range of corpus edition date: ')
print( [final_df['corpus_edition_date'].min(), final_df['corpus_edition_date'].max()])
The range of original publication date:
[-350, 1985]
The range of corpus edition date:
[1887, 2016]
The original publication dates range from -350 (350 BC) to 1985. Throughout this period, the cultures in which these works were written were dominated by men. In the past, men had more opportunities to get an education, and therefore more opportunities to write philosophy and express their thoughts and ideas.
# Requires the wordcloud package: pip3 install wordcloud
# Adapted from "https://www.kaggle.com/docxian/history-of-philosophy-eda-word2vec-model"
stopwords = set(STOPWORDS)
schools = final_df.school.unique().tolist()
for school in schools:
    new_df = df[df.school == school]
    print('School = ', school)
    # Join all sentences of the school into one text
    words = " ".join(sentence for sentence in new_df.sentence_lowered)
    wordcloud = WordCloud(stopwords=stopwords, max_font_size=50, max_words=500,
                          width=600, height=400,
                          background_color="white").generate(words)
    plt.figure(figsize=(12, 8))
    plt.imshow(wordcloud, interpolation="bilinear")
    plt.axis("off")
    # plt.savefig('wordCloud_' + school + '.png', dpi=300)
    plt.show()
School = plato
School = aristotle
School = empiricism
School = rationalism
School = analytic
School = continental
School = phenomenology
School = german_idealism
School = communism
School = capitalism
School = stoicism
School = nietzsche
School = feminism
In the word clouds above, "man" or "men" is easy to spot for every school except capitalism and German idealism, so man is mentioned a lot in those works. Other words also appear frequently, such as "good," "think," and "thing." This suggests that man is an essential subject in philosophy. Historically, men controlled much of the world, and the figure of man was used to show how to think correctly, act like good people, and do the right things.
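The word clouds give only a visual impression, and the schools differ a lot in size, so it may help to back them with a rate. The sketch below is an illustration and not part of the original notebook: `rate_per_10k` is a hypothetical helper, the tokenizer is a simple regular expression rather than WordCloud's own, and the two sample sentences are made up. It counts how often a set of target words occurs per 10,000 tokens.

```python
import re
from collections import Counter

# Words of two or more characters (an assumption, not WordCloud's tokenizer)
TOKEN_RE = re.compile(r"\b\w\w+\b")

def rate_per_10k(sentences, targets):
    """Occurrences of the target words per 10,000 tokens (hypothetical helper)."""
    tokens = [tok for s in sentences for tok in TOKEN_RE.findall(s.lower())]
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(counts[t] for t in targets)
    return 10_000 * hits / len(tokens)

# Made-up sentences standing in for one school's sentence_lowered column
sample = ["a good man thinks before he acts", "men and women share one nature"]
male_rate = rate_per_10k(sample, {"man", "men"})
female_rate = rate_per_10k(sample, {"woman", "women"})
print(round(male_rate, 1), round(female_rate, 1))
```

On the real data, one could pass `new_df.sentence_lowered` for each school instead of `sample`, which would make schools of very different sizes comparable.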
# Statistics for the top 10 words per school
# Adapted from "https://investigate.ai/text-analysis/counting-words-with-scikit-learns-countvectorizer/"
for school in schools:
    new_df = df[df.school == school]
    print('School = ', school, ':')
    # Treat all sentences of the school as a single document
    text = " ".join(sentence for sentence in new_df.sentence_lowered)
    vectorizer = CountVectorizer(stop_words='english')
    matrix = vectorizer.fit_transform([text])
    # get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
    counts = pd.DataFrame(matrix.toarray(),
                          columns=vectorizer.get_feature_names_out())
    print(counts.T.sort_values(by=0, ascending=False).head(10))
School = plato :
0
things 2930
say 2672
said 2331
good 2263
man 2180
think 2177
just 1950
way 1926
socrates 1866
people 1725
School = aristotle :
0
things 4461
man 4125
thing 2905
does 2819
good 2627
animals 2508
time 2379
case 2305
like 2164
body 2058
School = empiricism :
0
ideas 3486
idea 2385
mind 1942
men 1580
man 1416
things 1270
reason 1189
nature 1174
make 1167
power 990
School = rationalism :
0
god 3534
things 2349
mind 2188
body 1829
nature 1539
good 1396
reason 1391
man 1319
soul 1297
order 1281
School = analytic :
0
say 3526
true 3055
sense 2518
does 2496
case 2381
theory 2138
know 1943
way 1915
language 1853
world 1835
School = continental :
0
madness 2283
form 1913
language 1630
time 1624
order 1528
difference 1449
nature 1414
does 1414
man 1367
thought 1362
School = phenomenology :
0
world 4351
time 1813
way 1802
does 1682
dasein 1635
consciousness 1522
beings 1391
sense 1387
present 1368
knowledge 1368
School = german_idealism :
0
concept 4299
self 3833
reason 3634
nature 3434
consciousness 3224
object 2973
pure 2584
form 2525
existence 2511
does 2326
School = communism :
0
labour 3311
value 2508
capital 1547
production 1453
power 1095
time 1050
work 1027
form 926
working 914
means 908
School = capitalism :
0
price 2328
money 1906
labour 1786
value 1767
great 1758
capital 1648
country 1597
quantity 1514
produce 1506
greater 1318
School = stoicism :
0
thou 798
things 568
unto 436
man 341
thy 338
thee 306
nature 272
doth 247
good 195
thyself 193
School = nietzsche :
0
man 1068
thou 867
ye 694
life 680
zarathustra 626
god 581
good 550
like 524
things 523
great 491
School = feminism :
0
woman 3125
women 3071
man 1870
men 1216
life 1005
love 986
does 816
like 806
black 758
mother 713
# Graph the top 10 words per school
for school in schools:
    new_df = df[df.school == school]
    text = " ".join(sentence for sentence in new_df.sentence_lowered)
    vectorizer = CountVectorizer(stop_words='english')
    matrix = vectorizer.fit_transform([text])
    # get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
    counts = pd.DataFrame(matrix.toarray(),
                          columns=vectorizer.get_feature_names_out())
    top_10 = counts.T.sort_values(by=0, ascending=False).head(10)
    top_10.plot(title=school, kind='bar')
In the graphs above, "man" ranks 5th in Plato, 2nd in Aristotle, 5th in Empiricism (just behind "men" in 4th), 8th in Rationalism, 9th in Continental, 4th in Stoicism, 1st in Nietzsche, and 3rd in Feminism. It does not reach the top 10 in the other five schools.
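Reading ranks off thirteen bar charts is error-prone, so the rank can also be computed directly. The following is a minimal sketch under stated assumptions: `word_rank` is a hypothetical helper, the stop list is a tiny illustrative one (the notebook uses scikit-learn's built-in English list), the regular expression mimics the "two or more word characters" token rule, and the sample sentences are made up.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"\b\w\w+\b")
# Tiny illustrative stop list, not scikit-learn's English stop words
STOP_WORDS = {"the", "a", "is", "of", "and", "to", "in", "that", "it"}

def word_rank(sentences, word, stop_words=STOP_WORDS):
    """Return the 1-based frequency rank of `word`, or None if it never occurs."""
    counts = Counter(
        tok
        for s in sentences
        for tok in TOKEN_RE.findall(s.lower())
        if tok not in stop_words
    )
    if word not in counts:
        return None
    # Rank = 1 + number of words that are strictly more frequent (ties share a rank)
    return 1 + sum(1 for c in counts.values() if c > counts[word])

# Made-up sentences standing in for one school's sentence_lowered column
sample = [
    "the good of man is the good of the soul",
    "man is by nature a political animal",
    "the soul is the form of the body",
]
print(word_rank(sample, "man"), word_rank(sample, "body"), word_rank(sample, "woman"))
```

Because tied words share a rank here, the result can differ by a place or two from the order in which a sorted DataFrame happens to list them.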
# Adapted from "https://kavita-ganesan.com/tfidftransformer-tfidfvectorizer-usage-differences/#.YVDK_mZucq0"
for school in schools:
    new_df = df[df.school == school]
    print('School = ', school, ':')
    # Each sentence is one document for the IDF computation
    cv = CountVectorizer(stop_words='english')
    word_count_vector = cv.fit_transform(new_df.sentence_lowered)
    tfidf_transformer = TfidfTransformer(smooth_idf=True, use_idf=True)
    tfidf_transformer.fit(word_count_vector)
    # get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
    df_idf = pd.DataFrame(tfidf_transformer.idf_, index=cv.get_feature_names_out(), columns=["idf_weights"])
    # Sort ascending and descending
    top_10_ascending = df_idf.sort_values(by=['idf_weights'], ascending=True).head(10)
    top_10_descending = df_idf.sort_values(by=['idf_weights'], ascending=False).head(10)
    print("ascending:", top_10_ascending)
    print("descending:", top_10_descending)
    top_10_ascending.plot(title=school, kind='bar')
School = plato :
ascending: idf_weights
say 3.704849
things 3.707581
said 3.829623
think 3.894368
good 3.956053
man 3.980908
just 4.032553
way 4.036889
socrates 4.059967
people 4.149457
descending: idf_weights
ψυχη 10.861806
resided 10.861806
remonstrates 10.861806
faintly 10.861806
fainthearted 10.861806
remotely 10.861806
remoter 10.861806
remotest 10.861806
removable 10.861806
rending 10.861806
School = aristotle :
ascending: idf_weights
things 3.586039
man 3.750770
thing 3.985534
does 3.999018
animals 4.102962
case 4.149678
time 4.231356
like 4.245467
good 4.257646
way 4.259245
descending: idf_weights
katatheis 11.101928
landanimal 11.101928
laments 11.101928
lametic 11.101928
laminar 11.101928
lampholder 11.101928
lamplight 11.101928
lampros 11.101928
lances 11.101928
lancet 11.101928
School = empiricism :
ascending: idf_weights
ideas 2.950285
idea 3.394589
mind 3.412908
men 3.651578
man 3.789386
things 3.866575
nature 3.894293
reason 3.910746
make 3.918147
certain 4.151322
descending: idf_weights
zones 10.206935
ferries 10.206935
fermentations 10.206935
feraces 10.206935
reversion 10.206935
felled 10.206935
reverso 10.206935
feint 10.206935
feigns 10.206935
reverted 10.206935
School = rationalism :
ascending: idf_weights
god 3.057292
things 3.413529
mind 3.485168
body 3.763135
nature 3.827305
reason 3.915790
man 3.982315
good 4.006685
order 4.014646
men 4.028057
descending: idf_weights
aa 10.347926
praesertim 10.347926
predicates 10.347926
fatigationem 10.347926
fatigued 10.347926
predica 10.347926
fatis 10.347926
predetermining 10.347926
fatten 10.347926
fattened 10.347926
School = analytic :
ascending: idf_weights
say 3.822036
true 4.082098
does 4.167037
sense 4.212599
case 4.232518
theory 4.373195
way 4.416212
fact 4.487956
know 4.489138
think 4.524630
descending: idf_weights
linguistische 11.229657
merated 11.229657
mendelson 11.229657
mengenlehre 11.229657
menna 11.229657
mentahty 11.229657
mentalphysical 11.229657
mented 11.229657
mentinn 11.229657
mentor 11.229657
School = continental :
ascending: idf_weights
madness 3.854607
form 3.968285
time 4.144863
order 4.207250
does 4.235442
language 4.241480
nature 4.293530
difference 4.350126
thought 4.356051
man 4.386213
descending: idf_weights
aaron 10.734477
maitres 10.734477
majest 10.734477
majorwriting 10.734477
majus 10.734477
makespossible 10.734477
mala 10.734477
malad 10.734477
maladie 10.734477
maldiney 10.734477
School = phenomenology :
ascending: idf_weights
world 3.152834
way 3.815419
does 3.898242
dasein 3.940388
time 3.946366
sense 4.105637
consciousness 4.116635
beings 4.139808
present 4.187831
things 4.244540
descending: idf_weights
aa 10.567105
foi 10.567105
probes 10.567105
probed 10.567105
prnesitum 10.567105
prizing 10.567105
prize 10.567105
privativum 10.567105
fohren 10.567105
foil 10.567105
School = german_idealism :
ascending: idf_weights
concept 3.557054
self 3.591938
reason 3.634346
nature 3.707742
consciousness 3.803265
object 3.869215
pure 3.948386
form 3.955200
existence 3.981991
does 3.988096
descending: idf_weights
flying 10.955534
glare 10.955534
givens 10.955534
giver 10.955534
giveth 10.955534
quem 10.955534
quelque 10.955534
gj 10.955534
gl 10.955534
quel 10.955534
School = communism :
ascending: idf_weights
labour 3.015962
value 3.335932
capital 3.640450
production 3.668153
time 3.925793
power 3.943604
work 3.984602
social 4.097579
means 4.098812
working 4.105004
descending: idf_weights
aargau 10.102699
maatschappij 10.102699
luxemburg 10.102699
luxuriance 10.102699
luxuriously 10.102699
luxuriousness 10.102699
lycurgus 10.102699
lyne 10.102699
lynx 10.102699
lzenden 10.102699
School = capitalism :
ascending: idf_weights
price 3.314472
great 3.461602
money 3.493019
labour 3.519292
value 3.533730
country 3.547677
capital 3.593662
produce 3.643409
quantity 3.715497
greater 3.757046
descending: idf_weights
intermeddle 10.115755
embezzling 10.115755
emergence 10.115755
emerged 10.115755
embroidery 10.115755
embroideries 10.115755
produit 10.115755
embodiment 10.115755
embodies 10.115755
embellish 10.115755
School = stoicism :
ascending: idf_weights
thou 2.513984
things 2.696306
unto 2.935710
man 3.243632
thy 3.273823
thee 3.349406
nature 3.357704
doth 3.486485
good 3.714379
thyself 3.726356
descending: idf_weights
abandoned 8.145196
mature 8.145196
markets 8.145196
marks 8.145196
marriage 8.145196
marriages 8.145196
marrying 8.145196
mart 8.145196
masons 8.145196
mastery 8.145196
School = nietzsche :
ascending: idf_weights
man 3.694052
thou 4.014281
zarathustra 4.127189
life 4.164929
ye 4.198710
good 4.301462
like 4.307492
things 4.359210
did 4.391575
god 4.398176
descending: idf_weights
abandon 9.820921
mandates 9.820921
maligns 9.820921
malthus 9.820921
manageable 9.820921
manages 9.820921
managing 9.820921
mandarins 9.820921
mandeikagati 9.820921
malices 9.820921
School = feminism :
ascending: idf_weights
woman 2.892623
women 2.960395
man 3.393879
men 3.810874
life 3.988035
love 4.075918
like 4.222154
does 4.253599
black 4.316657
mother 4.391904
descending: idf_weights
kidnapping 10.139703
macedonia 10.139703
longevity 10.139703
longhand 10.139703
longings 10.139703
longshoreman 10.139703
longueville 10.139703
lookers 10.139703
lookout 10.139703
loomed 10.139703
In the TF-IDF statistics and graphs above, the idf_weight of a word measures how rare the word is across the sentences of a school: the lower the weight, the more sentences contain the word. The idf_weight of "man" is among the ten lowest in eight of the thirteen schools, which means "man" is one of the most widely used words there. The highest idf_weight in each school lies between about 8 and 11, whereas the idf_weight of "man" is 3.98 in Plato, 3.75 in Aristotle, 3.79 in Empiricism, 3.98 in Rationalism, 4.39 in Continental, 3.24 in Stoicism, 3.69 in Nietzsche, and 3.39 in Feminism. We can conclude that "man" is a common word in most of the schools in our dataset.
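To make the idf_weights easier to interpret, here is a small worked sketch of the smoothed IDF formula that scikit-learn documents for `TfidfTransformer(smooth_idf=True)`: idf(t) = ln((1 + n) / (1 + df(t))) + 1, where n is the number of documents (here, sentences) and df(t) is the number of documents containing the word. The helper names `smooth_idf` and `doc_freq_from_idf` and the corpus size of 1,000 are assumptions made for illustration only.

```python
import math

def smooth_idf(n_docs, doc_freq):
    """Smoothed IDF: ln((1 + n) / (1 + df)) + 1."""
    return math.log((1 + n_docs) / (1 + doc_freq)) + 1

def doc_freq_from_idf(n_docs, idf):
    """Invert the formula to estimate how many documents contain a word."""
    return (1 + n_docs) / math.exp(idf - 1) - 1

n_sentences = 1000                      # made-up corpus size, one sentence = one document
common = smooth_idf(n_sentences, 200)   # word found in 200 sentences
rare = smooth_idf(n_sentences, 1)       # word found in a single sentence
print(round(common, 3), round(rare, 3))
print(round(doc_freq_from_idf(n_sentences, common)))
```

As a sanity check against the output above, Plato has 38,366 sentences, so a word that occurs in only one of them gets ln(38367 / 2) + 1, which is about 10.86 and matches the largest weight reported for that school.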
From studying this philosophy dataset, I conclude that "man" is one of the most important subjects in the world of philosophy. Our history reflects that men dominated the world in the past. Read in terms of human nature, however, "man" can also simply mean a human being, an idea that helps us act and think critically.